Internet Surfer 2.0

Text File | 1996-02-12 | 28KB | 794 lines

Frequently Asked Questions (FAQS); faqs.222

        p = (char *)((int *)p + 1);

or simply

        p += sizeof(int);

References: ANSI Sec. 3.3.4, Rationale Sec. 3.3.2.4 p. 43.


Section 3. Memory Allocation

3.1:  Why doesn't this fragment work?

        char *answer;
        printf("Type something:\n");
        gets(answer);
        printf("You typed \"%s\"\n", answer);

A:  The pointer variable "answer," which is handed to the gets function as the location into which the response should be stored, has not been set to point to any valid storage. That is, we cannot say where the pointer "answer" points. (Since local variables are not initialized, and typically contain garbage, it is not even guaranteed that "answer" starts out as a null pointer. See question 17.1.)

The simplest way to correct the question-asking program is to use a local array, instead of a pointer, and let the compiler worry about allocation:

        #include <string.h>

        char answer[100], *p;
        printf("Type something:\n");
        fgets(answer, sizeof(answer), stdin);
        if((p = strchr(answer, '\n')) != NULL)
                *p = '\0';
        printf("You typed \"%s\"\n", answer);

Note that this example also uses fgets instead of gets (always a good idea). fgets takes the size of the array as an argument, so it will not overwrite the end of the array if the user types an overly long line. (Unfortunately for this example, fgets does not automatically delete the trailing \n, as gets would.) It would also be possible to use malloc to allocate the answer buffer, and/or to parameterize its size (#define ANSWERSIZE 100).

3.2:  I can't get strcat to work. I tried

        char *s1 = "Hello, ";
        char *s2 = "world!";
        char *s3 = strcat(s1, s2);

but I got strange results.

A:  Again, the problem is that space for the concatenated result is not properly allocated. C does not provide an automatically-managed string type. C compilers only allocate memory for objects explicitly mentioned in the source code (in the case of "strings," this includes character arrays and string literals).
The programmer must arrange (explicitly) for sufficient space for the results of run-time operations such as string concatenation, typically by declaring arrays, or by calling malloc.

strcat performs no allocation; the second string is appended to the first one, in place. Therefore, one fix would be to declare the first string as an array with sufficient space:

        char s1[20] = "Hello, ";

Since strcat returns the value of its first argument (s1, in this case), the s3 variable is superfluous.

Reference: CT&P Sec. 3.2 p. 32.

3.3:  But the man page for strcat says that it takes two char *'s as arguments. How am I supposed to know to allocate things?

A:  In general, when using pointers you _always_ have to consider memory allocation, at least to make sure that the compiler is doing it for you. If a library routine's documentation does not explicitly mention allocation, it is usually the caller's problem.

The Synopsis section at the top of a Unix-style man page can be misleading. The code fragments presented there are closer to the function definition used by the call's implementor than the invocation used by the caller. In particular, many routines which accept pointers (e.g. to structs or strings) are usually called with the address of some object (a struct, or an array -- see questions 2.3 and 2.4). Another common example is stat().

3.4:  I have a function that is supposed to return a string, but when it returns to its caller, the returned string is garbage.

A:  Make sure that the memory to which the function returns a pointer is correctly allocated. The returned pointer should be to a statically-allocated buffer, or to a buffer passed in by the caller, but _not_ to a local array. See also question 17.3.

3.5:  You can't use dynamically-allocated memory after you free it, can you?

A:  No. Some early man pages for malloc stated that the contents of freed memory were "left undisturbed;" this ill-advised guarantee was never universal and is not required by ANSI.
Few programmers would use the contents of freed memory deliberately, but it is easy to do so accidentally. Consider the following (correct) code for freeing a singly-linked list:

        struct list *listp, *nextp;
        for(listp = base; listp != NULL; listp = nextp) {
                nextp = listp->next;
                free((char *)listp);
        }

and notice what would happen if the more-obvious loop iteration expression listp = listp->next were used, without the temporary nextp pointer.

References: ANSI Rationale Sec. 4.10.3.2 p. 102; CT&P Sec. 7.10 p. 95.

3.6:  How does free() know how many bytes to free?

A:  The malloc/free package remembers the size of each block it allocates and returns, so it is not necessary to remind it of the size when freeing.

3.7:  So can I query the malloc package to find out how big an allocated block is?

A:  Not portably.

3.8:  Is it legal to pass a null pointer as the first argument to realloc()? Why would you want to?

A:  ANSI C sanctions this usage (and the related realloc(..., 0), which frees), but several earlier implementations do not support it, so it is not widely portable. Passing an initially-null pointer to realloc can make it easier to write a self-starting incremental allocation algorithm.

References: ANSI Sec. 4.10.3.4.

3.9:  What is the difference between calloc and malloc? Is it safe to use calloc's zero-fill guarantee for pointer and floating-point values? Does free work on memory allocated with calloc, or do you need a cfree?

A:  calloc(m, n) is essentially equivalent to

        p = malloc(m * n);
        memset(p, 0, m * n);

The zero fill is all-bits-zero, and does not therefore guarantee useful zero values for pointers (see section 1 of this list) or floating-point values. free can (and should) be used to free the memory allocated by calloc.

References: ANSI Secs. 4.10.3 to 4.10.3.2.

3.10:  What is alloca and why is its use discouraged?

A:  alloca allocates memory which is automatically freed when the function which called alloca returns.
That is, memory allocated with alloca is local to a particular function's "stack frame" or context.

alloca cannot be written portably, and is difficult to implement on machines without a stack. Its use is problematical (and the obvious implementation on a stack-based machine fails) when its return value is passed directly to another function, as in fgets(alloca(100), 100, stdin).

For these reasons, alloca cannot be used in programs which must be widely portable, no matter how useful it might be.

References: ANSI Rationale Sec. 4.10.3 p. 102.


Section 4. Expressions

4.1:  Why doesn't this code:

        a[i] = i++;

work?

A:  The subexpression i++ causes a side effect -- it modifies i's value -- which leads to undefined behavior if i is also referenced elsewhere in the same expression.

References: ANSI Sec. 3.3 p. 39.

4.2:  Under my compiler, the code

        int i = 7;
        printf("%d\n", i++ * i++);

prints 49. Regardless of the order of evaluation, shouldn't it print 56?

A:  Although the postincrement and postdecrement operators ++ and -- perform the operations after yielding the former value, the implication of "after" is often misunderstood. It is _not_ guaranteed that the operation is performed immediately after giving up the previous value and before any other part of the expression is evaluated. It is merely guaranteed that the update will be performed sometime before the expression is considered "finished" (before the next "sequence point," in ANSI C's terminology). In the example, the compiler chose to multiply the previous value by itself and to perform both increments afterwards.

The behavior of code which contains multiple, ambiguous side effects has always been undefined. Don't even try to find out how your compiler implements such things (contrary to the ill-advised exercises in many C textbooks); as K&R wisely point out, "if you don't know _how_ they are done on various machines, that innocence may help to protect you."

References: K&R I Sec. 2.12 p. 50; K&R II Sec. 2.12 p.
54; ANSI Sec. 3.3 p. 39; CT&P Sec. 3.7 p. 47; PCS Sec. 9.5 pp. 120-1. (Ignore H&S Sec. 7.12 pp. 190-1, which is obsolete.)

4.3:  But what about the &&, ||, and comma operators? I see code like "if((c = getchar()) == EOF || c == '\n')" ...

A:  There is a special exception for those operators (as well as ?:); each of them does imply a sequence point (i.e. left-to-right evaluation is guaranteed). Any book on C should make this clear.

References: K&R I Sec. 2.6 p. 38, Secs. A7.11-12 pp. 190-1; K&R II Sec. 2.6 p. 41, Secs. A7.14-15 pp. 207-8; ANSI Secs. 3.3.13 p. 52, 3.3.14 p. 52, 3.3.15 p. 53, 3.3.17 p. 55; CT&P Sec. 3.7 pp. 46-7.

4.4:  If I'm not using the value of the expression, should I use i++ or ++i to increment a variable?

A:  Since the two forms differ only in the value yielded, they are entirely equivalent when only their side effect is needed. Some people will tell you that in the old days one form was preferred over the other because it utilized a PDP-11 autoincrement addressing mode, but those people are confused.

4.5:  Why doesn't the code

        int a = 1000, b = 1000;
        long int c = a * b;

work?

A:  Under C's integral promotion rules, the multiplication is carried out using int arithmetic, and the result may overflow and/or be truncated before being assigned to the long int left-hand side. Use an explicit cast to force long arithmetic:

        long int c = (long int)a * b;


Section 5. ANSI C

5.1:  What is the "ANSI C Standard?"

A:  In 1983, the American National Standards Institute commissioned a committee, X3J11, to standardize the C language. After a long, arduous process, including several widespread public reviews, the committee's work was finally ratified as an American National Standard, X3.159-1989, on December 14, 1989, and published in the spring of 1990.
For the most part, ANSI C standardizes existing practice, with a few additions from C++ (most notably function prototypes) and support for multinational character sets (including the much-lambasted trigraph sequences). The ANSI C standard also formalizes the C run-time library support routines.

The published Standard includes a "Rationale," which explains many of its decisions, and discusses a number of subtle points, including several of those covered here. (The Rationale is "not part of ANSI Standard X3.159-1989, but is included for information only.")

The Standard has been adopted as an international standard, ISO/IEC 9899:1990, although the sections are numbered differently (briefly, ANSI sections 2 through 4 correspond roughly to ISO sections 5 through 7), and the Rationale is currently not included.

5.2:  How can I get a copy of the Standard?

A:  Copies are available from

        American National Standards Institute
        11 W. 42nd St., 13th floor
        New York, NY  10036  USA
        (+1) 212 642 4900

or

        Global Engineering Documents
        2805 McGaw Avenue
        Irvine, CA  92714  USA
        (+1) 714 261 1455
        (800) 854 7179  (U.S. & Canada)

The cost from ANSI is $50.00, plus $6.00 shipping. Quantity discounts are available. (Note that ANSI derives revenues to support its operations from the sale of printed standards, so electronic copies are _not_ available.)

The Rationale, by itself, has been printed by Silicon Press, ISBN 0-929306-07-4.

5.3:  Does anyone have a tool for converting old-style C programs to ANSI C, or vice versa, or for automatically generating prototypes?

A:  Two programs, protoize and unprotoize, convert back and forth between prototyped and "old style" function definitions and declarations. (These programs do _not_ handle full-blown translation between "Classic" C and ANSI C.) These programs exist as patches to the FSF GNU C compiler, gcc. Look for the file protoize-1.39.0.5.Z in pub/gnu at prep.ai.mit.edu (18.71.0.38), or at several other FSF archive sites.
The unproto program (/pub/unix/unproto4.shar.Z on ftp.win.tue.nl) is a filter which sits between the preprocessor and the next compiler pass, converting most of ANSI C to traditional C on-the-fly.

The GNU GhostScript package comes with a little program called ansi2knr.

Several prototype generators exist, many as modifications to lint. Version 3 of CPROTO was posted to comp.sources.misc in March, 1992. (See also question 17.8.)

5.4:  I'm trying to use the ANSI "stringizing" preprocessing operator # to insert the value of a symbolic constant into a message, but it keeps stringizing the macro's name rather than its value.

A:  You must use something like the following two-step procedure to force the macro to be expanded as well as stringized:

        #define str(x) #x
        #define xstr(x) str(x)
        #define OP plus
        char *opname = xstr(OP);

This sets opname to "plus" rather than "OP".

An equivalent circumlocution is necessary with the token-pasting operator ## when the values (rather than the names) of two macros are to be concatenated.

References: ANSI Sec. 3.8.3.2, Sec. 3.8.3.5 example p. 93.

5.5:  What's the difference between "char const *p" and "char * const p"?

A:  "char const *p" is a pointer to a constant character (you can't change the character); "char * const p" is a constant pointer to a (variable) character (i.e. you can't change the pointer). (Read these "inside out" to understand them. See question 10.3.)

References: ANSI Sec. 3.5.4.1.

5.6:  My ANSI compiler complains about a mismatch when it sees

        extern int func(float);

        int func(x)
        float x;
        {...

A:  You have mixed the new-style prototype declaration "extern int func(float);" with the old-style definition "int func(x) float x;".

Old C (and ANSI C, in the absence of prototypes, and in variable-length argument lists) "widens" certain arguments when they are passed to functions. floats are promoted to double, and characters and short integers are promoted to integers.
(The values are automatically coerced back to the corresponding narrower types within the body of the called function, if they are declared that way there.)

The problem can be fixed either by using new-style syntax consistently in the definition:

        int func(float x) { ... }

or by changing the new-style prototype declaration to match the old-style definition:

        extern int func(double);

(In this case, it would be clearest to change the old-style definition to use double as well, as long as the address of that parameter is not taken.)

It may also be safer to avoid "narrow" (char, short int, and float) function arguments and return types.

References: ANSI Sec. 3.3.2.2.

5.7:  I'm getting strange syntax errors inside code which I've #ifdeffed out.

A:  Under ANSI C, the text inside a "turned off" #if, #ifdef, or #ifndef must still consist of "valid preprocessing tokens." This means that there must be no unterminated comments or quotes (note particularly that an apostrophe within a contracted word could look like the beginning of a character constant), and no newlines inside quotes. Therefore, natural-language comments and pseudocode should always be written between the "official" comment delimiters /* and */. (But see also question 17.10, and 6.7.)

References: ANSI Sec. 2.1.1.2 p. 6, Sec. 3.1 p. 19 line 37.

5.8:  Can I declare main as void, to shut off these annoying "main returns no value" messages? (I'm calling exit(), so main doesn't return.)

A:  No. main must be declared as returning an int, and as taking either zero or two arguments (of the appropriate type). If you're calling exit() but still getting warnings, you'll have to insert a redundant return statement (or use some kind of "notreached" directive, if available).

References: ANSI Sec. 2.1.2.2.1 pp. 7-8.

5.9:  Why does the ANSI Standard not guarantee more than six monocase characters of external identifier significance?
A:  The problem is older linkers which are under the control of neither the ANSI standard nor the C compiler developers on the systems which have them. The limitation is only that identifiers be _significant_ in the first six characters, not that they be restricted to six characters in length. This limitation is annoying, but certainly not unbearable, and is marked in the Standard as "obsolescent," i.e. a future revision will likely relax it.

This concession to current, restrictive linkers really had to be made, no matter how vehemently some people oppose it. (The Rationale notes that its retention was "most painful.") If you disagree, or have thought of a trick by which a compiler burdened with a restrictive linker could present the C programmer with the appearance of more significance in external identifiers, read the excellently-worded section 3.1.2 in the X3.159 Rationale (see question 5.1), which discusses several such schemes and explains why they could not be mandated.

References: ANSI Sec. 3.1.2 p. 21, Sec. 3.9.1 p. 96, Rationale Sec. 3.1.2 pp. 19-21.

5.10:  What is the difference between memcpy and memmove?

A:  memmove offers guaranteed behavior if the source and destination arguments overlap. memcpy makes no such guarantee, and may therefore be more efficiently implementable. When in doubt, it's safer to use memmove.

References: ANSI Secs. 4.11.2.1, 4.11.2.2, Rationale Sec. 4.11.2.

5.11:  My compiler is rejecting the simplest possible test programs, with all kinds of syntax errors.

A:  Perhaps it is a pre-ANSI compiler, unable to accept function prototypes and the like.

5.12:  Why won't the Frobozz Magic C Compiler, which claims to be ANSI compliant, accept this code? I know that the code is ANSI, because gcc accepts it.

A:  Most compilers support a few non-Standard extensions, gcc more so than most. Are you sure that the code being rejected doesn't rely on such an extension?
It is usually a bad idea to perform experiments with a particular compiler to determine properties of a language; the applicable standard may permit variations, or the compiler may be wrong.

5.13:  What are #pragmas and what are they good for?

A:  The #pragma directive provides a single, well-defined "escape hatch" which can be used for all sorts of implementation-specific controls and extensions: source listing control, structure packing, warning suppression (like the old lint /* NOTREACHED */ comments), etc.

References: ANSI Sec. 3.8.6.


Section 6. C Preprocessor

6.1:  How can I write a generic macro to swap two values?

A:  There is no good answer to this question. If the values are integers, a well-known trick using exclusive-OR could perhaps be used, but it will not work for floating-point values or pointers, or if the two values are the same variable (and the "obvious" supercompressed implementation for integral types a^=b^=a^=b is in fact illegal due to multiple side-effects, and...). If the macro is intended to be used on values of arbitrary type (the usual goal), it cannot use a temporary, since it does not know what type of temporary it needs, and standard C does not provide a typeof operator.

The best all-around solution is probably to forget about using a macro, unless you're willing to pass in the type as a third argument.

6.2:  I have some old code that tries to construct identifiers with a macro like

        #define Paste(a, b) a/**/b

but it doesn't work any more.

A:  That comments disappeared entirely and could therefore be used for token pasting was an undocumented feature of some early preprocessor implementations, notably Reiser's. ANSI affirms (as did K&R) that comments are replaced with white space. However, since the need for pasting tokens was demonstrated and real, ANSI introduced a well-defined token-pasting operator, ##, which can be used like this:

        #define Paste(a, b) a##b

(See also question 5.4.)

Reference: ANSI Sec. 3.8.3.3 p. 91, Rationale pp. 66-7.
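
As a rough illustration (a sketch, not part of the original answer): ## pastes its operands exactly as written, so pasting the _values_ of two macros needs the same two-level arrangement shown for # in question 5.4. The names PASTE, XPASTE, PREFIX, SUFFIX, counter1, and PREFIXSUFFIX below are made up for this example.

```c
/* Direct pasting: the operands of ## are NOT macro-expanded first. */
#define PASTE(a, b) a##b

/* Two-level form: the outer macro's arguments are expanded before
   being handed to PASTE, so the macros' values get pasted. */
#define XPASTE(a, b) PASTE(a, b)

/* Hypothetical macros whose values we would like to join. */
#define PREFIX counter
#define SUFFIX 1

int counter1 = 10;      /* the identifier XPASTE(PREFIX, SUFFIX) forms */
int PREFIXSUFFIX = 20;  /* the identifier PASTE(PREFIX, SUFFIX) forms */

/* Expands to "return PREFIXSUFFIX;" because the names are pasted. */
int direct_paste_value(void)
{
    return PASTE(PREFIX, SUFFIX);
}

/* Expands to "return counter1;" because the values are pasted. */
int expanded_paste_value(void)
{
    return XPASTE(PREFIX, SUFFIX);
}
```

Here direct_paste_value() yields the value of PREFIXSUFFIX and expanded_paste_value() yields the value of counter1, since only the two-level form expands PREFIX and SUFFIX before pasting.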
6.3:  What's the best way to write a multi-statement cpp macro?

A:  The usual goal is to write a macro that can be invoked as if it were a single function-call statement. This means that the "caller" will be supplying the final semicolon, so the macro body should not. The macro body cannot be a simple brace-delineated compound statement, because syntax errors would result if it were invoked (apparently as a single statement, but with a resultant extra semicolon) as the if branch of an if/else statement with an explicit else clause.

The traditional solution is to use

        #define Func() do { \
                /* declarations */ \
                stmt1; \
                stmt2; \
                /* ... */ \
                } while(0)      /* (no trailing ; ) */

When the "caller" appends a semicolon, this expansion becomes a single statement regardless of context. (An optimizing compiler will remove any "dead" tests or branches on the constant condition 0, although lint may complain.)

If all of the statements in the intended macro are simple expressions, with no declarations or loops, another technique is to write a single, parenthesized expression using one or more comma operators. (See the example under question 6.8 below. This technique also allows a value to be "returned.")

Reference: CT&P Sec. 6.3 pp. 82-3.

6.4:  Is it acceptable for one header file to #include another?

A:  There has been considerable debate surrounding this question. Many people believe that "nested #include files" are to be avoided: the prestigious Indian Hill Style Guide (see question 14.3) disparages them; they can make it harder to find relevant definitions; they can lead to multiple-declaration errors if a file is #included twice; and they make manual Makefile maintenance very difficult.
On the other hand, they make it possible to use header files in a modular way (a header file #includes what it needs itself, rather than requiring each #includer to do so, a requirement that can lead to intractable headaches); a tool like grep (or a tags file) makes it easy to find definitions no matter where they are; a popular trick:

        #ifndef HEADER_FILE_NAME
        #define HEADER_FILE_NAME
        ...header file contents...
        #endif

makes a header file "idempotent" so that it can safely be #included multiple times; and automated Makefile maintenance tools (which are a virtual necessity in large projects anyway) handle dependency generation in the face of nested #include files easily.

6.5:  Does the sizeof operator work in preprocessor #if directives?

A:  No. Preprocessing happens during an earlier pass of compilation, before type names have been parsed. Consider using the predefined constants in ANSI's <limits.h>, or a "configure" script, instead.

References: ANSI Sec. 2.1.1.2 pp. 6-7, Sec. 3.8.1 p. 87 footnote 83.

6.6:  How can I use a preprocessor #if expression to tell if a machine is big-endian or little-endian?

A:  You probably can't. (Preprocessor arithmetic uses only long ints, and there is no concept of addressing.) Are you sure you need to know the machine's endianness explicitly? Usually it's better to write code which doesn't care.

6.7:  I've got this tricky processing I want to do at compile time and I can't figure out a way to get cpp to do it.

A:  cpp is not intended as a general-purpose preprocessor. Rather than forcing it to do something inappropriate, consider writing your own little special-purpose preprocessing tool, instead. You can easily get a utility like make(1) to run it for you automatically.

If you are trying to preprocess something other than C, consider using a general-purpose preprocessor (such as m4).

6.8:  How can I write a cpp macro which takes a variable number of arguments?
A:  One popular trick is to define the macro with a single argument, and call it with a double set of parentheses, which appear to the preprocessor to indicate a single argument:

        #define DEBUG(args) (printf("DEBUG: "), printf args)

        if(n != 0) DEBUG(("n is %d\n", n));

The obvious disadvantage is that the caller must always remember to use the extra parentheses. (It is often better to use a bona-fide function, which can take a variable number of arguments in a well-defined way. See questions 7.1 and 7.2 below.)


Section 7. Variable-Length Argument Lists

7.1:  How can I write a function that takes a variable number of arguments?

A:  Use the <stdarg.h> header (or, if you must, the older <varargs.h>).

Here is a function which concatenates an arbitrary number of strings into malloc'ed memory:

        #include <stdlib.h>     /* for malloc, NULL, size_t */
        #include <stdarg.h>     /* for va_ stuff */
        #include <string.h>     /* for strcat et al */

        char *vstrcat(char *first, ...)
        {
                size_t len = 0;
                char *retbuf;
                va_list argp;
                char *p;

                if(first == NULL)
                        return NULL;

                len = strlen(first);

                va_start(argp, first);

                while((p = va_arg(argp, char *)) != NULL)
                        len += strlen(p);

                va_end(argp);

                retbuf = malloc(len + 1);       /* +1 for trailing \0 */

                if(retbuf == NULL)
                        return NULL;            /* error */

                (void)strcpy(retbuf, first);

                va_start(argp, first);

                while((p = va_arg(argp, char *)) != NULL)
                        (void)strcat(retbuf, p);

                va_end(argp);

                return retbuf;
        }

Usage is something like

        char *str = vstrcat("Hello, ", "world!", (char *)NULL);

Note the cast on the last argument. (Also note that the caller must free the returned, malloc'ed storage.)

Under a pre-ANSI compiler, rewrite the function definition without a prototype ("char *vstrcat(first) char *first; {"), include <stdio.h> rather than <stdlib.h>, add "extern char *malloc();", and use int instead of size_t. You may also have to delete the (void) casts, and use the older varargs package instead of stdarg. See the next question for hints.
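
The calling convention can be tried in isolation with a smaller function. The following is only a sketch: total_length is a hypothetical helper (not part of the FAQ's code) that performs just the first, length-counting pass of vstrcat, using the same null-pointer sentinel.

```c
#include <stdarg.h>     /* for va_list, va_start, va_arg, va_end */
#include <stddef.h>     /* for size_t, NULL */
#include <string.h>     /* for strlen */

/* Hypothetical helper: sums the lengths of a list of strings.
   The list must be terminated by (char *)NULL. */
size_t total_length(const char *first, ...)
{
    size_t len;
    va_list argp;
    char *p;

    if (first == NULL)
        return 0;               /* empty list */

    len = strlen(first);

    va_start(argp, first);
    /* Walk the unnamed arguments until the sentinel is seen. */
    while ((p = va_arg(argp, char *)) != NULL)
        len += strlen(p);
    va_end(argp);

    return len;
}
```

A call such as total_length("Hello, ", "world!", (char *)NULL) adds up the lengths of both strings; as with vstrcat, the terminating null pointer needs its explicit cast.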
Remember that in variable-length argument lists, function prototypes do not supply parameter type information; therefore, default argument promotions apply (see question 5.6), and null pointer arguments must be typed explicitly (see question 1.2).

References: K&R II Sec. 7.3 p. 155, Sec. B7 p. 254; H&S Sec. 13.4 pp. 286-9; ANSI Secs. 4.8 through 4.8.1.3.

7.2:  How can I write a function that takes a format string and a variable number of arguments, like printf, and passes them to printf to do most of the work?

A:  Use vprintf, vfprintf, or vsprintf.

Here is an "error" routine which prints an error message, preceded by the string "error: " and terminated with a newline:

        #include <stdio.h>
        #include <stdarg.h>

        void
        error(char *fmt, ...)
        {
                va_list argp;
                fprintf(stderr, "error: ");
                va_start(argp, fmt);
                vfprintf(stderr, fmt, argp);
                va_end(argp);
                fprintf(stderr, "\n");
        }

To use the older <varargs.h> package, instead of <stdarg.h>, change the function header to:

        void error(va_alist)
        va_dcl
        {
                char *fmt;

change the va_start line to

        va_start(argp);

and add the line

        fmt = va_arg(argp, char *);

between the calls to va_start and vfprintf. (Note that there is no semicolon after va_dcl.)

References: K&R II Sec. 8.3 p. 174, Sec. B1.2 p. 245; H&S Sec. 17.12 p. 337; ANSI Secs. 4.9.6.7, 4.9.6.8, 4.9.6.9.

7.3:  How can I discover how many arguments a function was actually called with?

A:  This information is not available to a portable program. Some systems provide a nonstandard nargs() function, but its use is questionable, since it typically returns the number of words passed, not the number of arguments. (Floating point values and structures are usually passed as several words.)

Any function which takes a variable number of arguments must be able to determine from the arguments themselves how many of them there are.
printf-like functions do this by looking for formatting specifiers (%d and the like) in the format string (which is why these functions fail badly if the format string does not match the argument list). Another common technique (useful when the arguments are all of the same type) is to use a sentinel value (often 0, -1, or an appropriately-cast null pointer) at the end of the list (see the execl and vstrcat examples under questions 1.2 and 7.1 above).

7.4:  How can I write a function which takes a variable number of arguments and passes them to some other function (which takes a variable number of arguments)?